huge httpd processes

huge httpd processes

on 29.09.2009 21:40:16 by Justin Wyllie

Hi clint

Yes. Linux and this script looks good. We think that part of the problem
is in the modules Apache is loading, so this will be useful.

I also have another couple of questions:

I have found the errant code where our process jumps by 13 MB. One part
does something like this:

$file_handle->read($s, $length);  # $s is about 1/2 MB
@data = unpack($format, $s);
## at this point memory usage jumps by 8 MB (measured using GTop->size())

while (@data) {
    push @data2, [shift @data, shift @data, shift @data];
    # this isn't exact, but it looks like each element of @data2 becomes a
    # reference to a 3-element array - i.e. the binary data was stored in triplets
}
# this loop causes another jump of 4 MB

return \@data2;

I tried undef'ing @data just before the return as it is no longer used, but
this only gained me 1/2 MB. I would have expected to get all 8 MB back. I
don't understand why not.


Also - in general terms if you do something like this (simplified):

package MyHandler;

use MyClass;

sub handler {
    my $obj = MyClass->new();
}

........
package MyClass;

our $var;
sub new() {
    $var = "hello world";
}


Since the module containing the package MyClass is loaded into the
apache/mod_perl process, does $var ever go out of scope once set? I think
not - and its memory is never freed? If this is correct and I used my
instead, even then would it go out of scope?


Thank you for your patience.

regards

Justin






> Hi Justin
>
>>
>> I'm wondering if anyone can advise me on how I could go about trying
>> to understand where this 90 MB is coming from? Some of it must be
>> the mod_perl and apache binaries - but how much should they be, and
>> apart from the 6 MB in shared memory for my pre-loaded modules, where
>> is it all coming from?
>
> You don't mention what platform you are on, but if it is linux, then
> check out this blog
>
> http://bmaurer.blogspot.com/2006/03/memory-usage-with-smaps.html
>
> and specifically this script
>
> http://www.contrib.andrew.cmu.edu/%7Ebmaurer/memory/smem.pl
>
>
> which will give you a much more accurate idea about how much memory is
> shared and where it is being used.
>
> As Perrin said, Perl data and code both go on the heap, so you won't be
> able to separate those two out with this tool, but combining smem.pl
> with loading modules one by one will get you a long way to a diagnosis.
>
> clint


Re: huge httpd processes

on 30.09.2009 00:16:56 by Clinton Gormley

>
> I tried undef'ing @data just before the return as it is no longer used,
> but this only gained me 1/2 MB. I would have expected to get all 8 MB
> back. I don't understand why not.
>
Perl (at least on the OSes that I'm familiar with) doesn't release used
memory back to the OS.

Have a look at:
- http://www.perlmonks.org/?node_id=333464
- http://www.perlmonks.com/index.pl?node_id=126591

for more discussion on the matter.

But basically, try to avoid loading all that data in your mod_perl
process.
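
If you do have to read it inside Apache, one way to at least keep the
intermediate structures small is to read and unpack the file in fixed-size
records rather than slurping ~0.5 MB and unpacking it in one go. A rough,
untested sketch - the file path, record format and record size are invented
here, adjust them to your real $format:

use strict;
use warnings;

my $record_format = 'N3';    # assumed: three 32-bit unsigned ints per record
my $record_size   = 12;      # bytes per record for that format

open my $fh, '<:raw', '/path/to/data.bin' or die "open: $!";  # placeholder path

my @data2;
while (my $read = read($fh, my $buf, $record_size)) {
    last if $read < $record_size;    # ignore a trailing partial record
    push @data2, [ unpack($record_format, $buf) ];
}
close $fh;

That way the process never holds more than one record's worth of raw
buffer, although @data2 itself will of course still grow with the number
of records.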
>
> Also - in general terms if you do something like this (simplified):
>

> our $var;
> sub new() {
> $var = "hello world";
> }
>
>
> Since the module containing the package MyClass is loaded into the
> apache/mod_perl process, does $var ever go out of scope once set? I think
> not - and its memory is never freed? If this is correct and I used my
> instead, even then would it go out of scope?

It wouldn't go out of scope, because its scope continues to exist until
the process exits.

There are good reasons for using variables at package level, but don't
use them if you don't have to.
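
To illustrate with a made-up version of your class (not your actual code):
under mod_perl the interpreter persists between requests, so a package
("our") variable keeps both its value and its memory for the life of the
child process, while a "my" lexical declared inside the sub becomes free
for reuse as soon as the sub returns, unless something like a closure
still refers to it.

package MyClass;
use strict;
use warnings;

our $var;                   # lives for the lifetime of the httpd child

sub new {
    my $class = shift;
    $var = "hello world";   # still set on the next request to this child
    my $tmp = "scratch";    # eligible for reuse once new() returns
    return bless {}, $class;
}

1;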

clint

Re: huge httpd processes

on 30.09.2009 01:14:35 by aw

Justin Wyllie wrote:
....

>
> $file_handle->read($s, $length); #$s is about 1/2 Mb
> @data = unpack($format , $s);
> ##at this point memory usage jumps by 8 Mbs (measured using GTop->size() )
>
> while (@data) {
> push @data2, [shift @data, shift @data, shift @data] ; # this isn't exact
> but it looks like each element of @data2 becomes a reference to a 3 element
> array - i.e the binary data was stored in triplets
> }
> #this loop causes another jump of 4 Mbs
>
> return \@data2;
>
Maybe a naive question, but is $file_handle always pointing to the same
file?

Then also, that whole logic above seems rather inefficient, both in
memory used and in overhead.
- each read() reads about 500K. So you use 500K right there.
- then these 500 KB are "parsed" (by the unpack(), presumably in chunks
of a predictable size), into presumably many elements of @data. That
causes @data to be large. (Say each element is a 64-bit integer, encoded
as 8 bytes each; 500KB/8 = 64,000 elements in @data).
- then at each while iteration, @data is shifted 3 times, to extract 3
consecutive elements, creating a new 3-element anonymous array. The
elements shifted out of @data are discarded. I would presume that Perl
is smarter than actually moving all remaining elements of @data each
time, but there is certainly some significant background work as a
result of each shift of @data.
- a reference to the 3-element array is then pushed onto @data2.
- then finally @data is discarded (or, at least, disregarded until the
next call). But the memory it used is never returned to the OS.

So if you were, for instance, to reduce the size of each read(), you would
reduce the number of elements of @data that are produced at each
unpack(), thus keeping @data smaller, at the cost of more read()'s.

Then again, $s is a byte buffer. With the unpack, you are "chunking" it
by re-applying the format $format over and over, building @data in the
process. But @data only serves to build these 3-element arrays whose
references you want to push into @data2. So why not unpack() the buffer
one 3-element chunk at a time, directly into a new 3-element array, and
do away with @data (and the shift()s) altogether?
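
A rough, untested sketch of that idea - assuming, just for illustration,
that each triplet is three 32-bit unsigned integers (format 'N3', 12
bytes); substitute your real per-triplet format and size:

use strict;
use warnings;

sub read_triplets {
    my ($s) = @_;                  # the raw buffer read from the file
    my $triplet_format = 'N3';     # assumed: three 32-bit unsigned ints
    my $triplet_size   = 12;       # bytes per triplet for that format

    my @data2;
    for (my $off = 0; $off + $triplet_size <= length $s; $off += $triplet_size) {
        push @data2, [ unpack($triplet_format, substr($s, $off, $triplet_size)) ];
    }
    return \@data2;
}

That builds @data2 straight from the buffer, so the large intermediate
@data (and all the shifting) never exists, at the cost of one small
unpack() per triplet.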

Re: huge httpd processes

on 30.09.2009 16:04:58 by Adam Prime

Justin Wyllie wrote:
> Hi clint
>
> Yes. Linux and this script looks good. We think that part of the problem
> is in the modules Apache is loading so this will be useful.
>
> I also have another couple of questions:
>
> I have found the errant code where our process jumps by 13 Mbs. One part
> does something like this:
>
> $file_handle->read($s, $length); #$s is about 1/2 Mb
> @data = unpack($format , $s);
> ##at this point memory usage jumps by 8 Mbs (measured using GTop->size() )

As Clinton said, perl doesn't free the memory back to the OS when you
slurp this file into RAM. If you really want to free up the resources
(which will get reused by subsequent requests; they just aren't
available to the OS) you can use $r->child_terminate to make that child
die after handling that request, which will free the resources (and
likely spawn another child in its place).
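
A hedged sketch of what that can look like under mod_perl 2 (the method
is provided by Apache2::RequestUtil there; under mod_perl 1 it's the same
$r->child_terminate call on the Apache request object). The handler and
the do_big_unpack() helper below are hypothetical stand-ins, not your
code:

package MyHandler;
use strict;
use warnings;

use Apache2::RequestRec  ();
use Apache2::RequestUtil ();
use Apache2::Const -compile => qw(OK);

sub do_big_unpack { return [ 1 .. 100_000 ] }   # stand-in for the real memory-hungry work

sub handler {
    my $r = shift;

    my $data = do_big_unpack();

    $r->content_type('text/plain');
    $r->print(scalar(@$data) . " records\n");

    # Retire this child once the current request is done; its bloated
    # memory goes back to the OS and Apache spawns a fresh child.
    $r->child_terminate;

    return Apache2::Const::OK;
}

1;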

Adam